SQLite Async API by geeksilva97 · Pull Request #59109 · nodejs/node

geeksilva97 · 2025-07-18T02:22:12Z

This PR implements an async API for node:sqlite module. So far, it contains a very minimal implementation of exec method, misses some tests, docs and refactoring but it is good enough to share the whole theory I have for it; with that, anybody can share thoughts about it.

Docs
Tests

Design

On C++ land, I plan to have the Database class determine whether the operations will be asynchronous.

Public API

For the public API, I plan to have classes such as Database, Statement, etc., as counterparts to' DatabaseSync', ' StatementSync', and so on.

nodejs-github-bot · 2025-07-18T02:22:17Z

Review requested:

@nodejs/sqlite

geeksilva97 · 2025-07-18T02:58:15Z

Concurrency Control

For concurrency control, SQLite provides a few options. Multi-threaded seems to be a good fit.

Multi-thread. In this mode, SQLite can be safely used by multiple threads provided that no single database connection nor any object derived from database connection, such as a prepared statement, is used in two or more threads at the same time.

We just need a way to guarantee that no two threads will be using connection or statements at the same time. Since we have a threadpool, hence no control over which thread a work will be executed, seems like we need a mutex. Please share your thoughts.

geeksilva97 · 2025-07-18T03:06:21Z

We just need a way to guarantee that no two threads will be using connection or statements at the same time. Since we have a threadpool, hence no control over which thread a work will be executed, seems like we need a mutex. Please share your thoughts.

I wonder if, for the first version of this API if the serialized mode is good enough

Serialized. In serialized mode, API calls to affect or use any SQLite database connection or any object derived from such a database connection can be made safely from multiple threads. The effect on an individual object is the same as if the API calls had all been made in the same order from a single thread. The name "serialized" arises from the fact that SQLite uses mutexes to serialize access to each object.

src/node_sqlite.h

geeksilva97 · 2025-12-01T18:05:44Z

having implemented worker thread offloading i can state that there's a significant amount of gnarly lifecycle plumbing involved, especially if one wants to support transactions if some of that plumbing could live within node.js itself it would be much appreciated not sure where a line would have to be drawn be to support more advanced use cases (e.g. 1 writer, <n> reader setups) but i'm eager to test stuff out!

Yeah. I will build a solution based on the authorizer, and let's see what you think about it.

louwers · 2025-12-01T19:22:32Z

From the beginning, the idea is not to default all queries to run asynchronously. It's about to add the possibility for async.

I think the API should reflect this. If it is named Database then I think a lot of people will just go for that one, thinking fewer keystrokes! And async is better/faster! For reading files, async often makes most sense. Correct me if I am wrong, but I think for SQLite queries, most of the time, the overhead makes async the worse option, even measured in wall clock time on the main thread.

Maybe calling it something like DatabaseAsync, StatementAsync is an option?

Or maybe DatabaseSync can be renamed to Database and we can have Database.execAsync?

louwers · 2026-01-05T16:55:55Z

deps/sqlite/sqlite.gyp

        'SQLITE_ENABLE_RBU',
        'SQLITE_ENABLE_RTREE',
        'SQLITE_ENABLE_SESSION',
+        'SQLITE_THREADSAFE=2',


Is there a performance penalty for purely single-thread use cases?

If so, would it make sense to allow setting the threading mode at runtime?

I don't know for sure about any performance penalti. But as discusses in the issue, due to the node.js nature, this will work fine for sync. Better-sqlite uses it.

Understood. Thanks! This can be closed.

If anything I think this will speed up single-threaded use cases, because apparently by default SQLite uses internal serialization. Did not test it though.

(I was going to post this on the main thread, but this is the context that matters, so I'll add it here) (edit: I rewrote for brevity):

Synchronous node:sqlite operations can run while your new async operations are executing. This will cause concurrent access to the same SQLite connection from multiple threads. This violates SQLITE_THREADSAFE=2 requirements and can cause ~~db corruption~~ undefined behavior (I could have sworn I read about this on the how-to-corrupt page but I don't see it now.)

SQLITE_THREADSAFE=2 requires no connection be used by multiple threads simultaneously (docs). Currently, nothing prevents sync methods (main thread) from running while async operations are on the thread pool.

Suggested fix: throw from sync methods when has_running_task_ is true:

diff --git a/src/node_sqlite.cc b/src/node_sqlite.cc --- a/src/node_sqlite.cc +++ b/src/node_sqlite.cc @@ -1257,6 +1257,8 @@ void Database::Prepare(const FunctionCallbackInfo<Value>& args) { ASSIGN_OR_RETURN_UNWRAP(&db, args.This()); Environment* env = Environment::GetCurrent(args); THROW_AND_RETURN_ON_BAD_STATE(env, !db->IsOpen(), "database is not open"); + THROW_AND_RETURN_ON_BAD_STATE(env, db->has_running_task_, + "Cannot call sync methods while async operation is running");

Same check needed in Exec() (sync path), Function(), Aggregate(), SetAuthorizer(), and LoadExtension(). Tested locally — works as expected.

(SQLITE_THREADSAFE=1 would also prevent undefined behavior, but the main thread would silently block on SQLite's internal mutex until the async op completes, defeating the purpose of async.)

Thank you for the in-depth examples. I'd like to mention another option (which I thought has been discussed in the issue, but maybe we were talking past each other):
Have one sqlite connection for synchronous access and maintain a pool of (seperate) sqlite3 db connections for asynchronous access. For each asynchronous access one connection from the pool would be exclusively used. The main difficulty with this approach is obviously db.prepare(). One would have to track which db connection owns the prepared statement instance and only dispatch operations from that prepared statement object to its owning db connection. Another drawback of that design is the inherent potential of SQLITE_BUSY failures for concurrent write access.

Yeah, if you don’t want to cross the streams, you need multiple database instances opened. Rather than transparently managing this magically begins the scenes in a pool, though, I think it may be less surprising to actually push the complexity of managing per -database handles to the user—they can figure out what makes sense in their app, and when one query wedges another, it’s their fault.

mcollina · 2026-01-06T22:34:17Z

It turns out I was very naive while conceiving this implementation. My approach was to utilize libuv's thread pool for running SQLite operations, as we do for backups.

In a situation where a query calls a user-defined function, authorizer, or aggregation, execution on the thread pool fails because it doesn't have access to an Isolate.

My first idea was to track the defined functions and identify when they are in the query; if functions are present, that query would run in the main thread. Another option would be using worker threads that have their own isolates.

@nodejs/sqlite would you have any clue? Thanks in advance.

The only viable solution is not support custom functions with the async mode. Basically you would need to move the computation back to the main thread, causing a possible deadlock situation.

geeksilva97 · 2026-01-07T02:01:20Z

It turns out I was very naive while conceiving this implementation. My approach was to utilize libuv's thread pool for running SQLite operations, as we do for backups.
In a situation where a query calls a user-defined function, authorizer, or aggregation, execution on the thread pool fails because it doesn't have access to an Isolate.
My first idea was to track the defined functions and identify when they are in the query; if functions are present, that query would run in the main thread. Another option would be using worker threads that have their own isolates.
@nodejs/sqlite would you have any clue? Thanks in advance.

The only viable solution is not support custom functions with the async mode. Basically you would need to move the computation back to the main thread, causing a possible deadlock situation.

That sounds good to me

geeksilva97 · 2026-01-15T13:37:17Z

Here's the state of this PR by now.

Async API delegates work to the Thread pool
Classes don't have a Sync postfix (like zlib), this is what is causing most of the conflicts.
Custom functions, aggregates and authorizer are not allowed, those functions require to be executed in the main thread.
Each database instance has a queue for async tasks. With that we ensure only no connection or statement will be used in multiple threads at the same time.
Iterators remains the same as async iterator helpers are still on stage 2 (https://github.com/tc39/proposal-async-iterator-helpers)

cc @mcollina @cjihrig

geeksilva97 · 2026-01-17T18:27:06Z

From the beginning, the idea is not to default all queries to run asynchronously. It's about to add the possibility for async.

I think the API should reflect this. If it is named Database then I think a lot of people will just go for that one, thinking fewer keystrokes! And async is better/faster! For reading files, async often makes most sense. Correct me if I am wrong, but I think for SQLite queries, most of the time, the overhead makes async the worse option, even measured in wall clock time on the main thread.

Maybe calling it something like DatabaseAsync, StatementAsync is an option?

Or maybe DatabaseSync can be renamed to Database and we can have Database.execAsync?

All of them are good options but the argument of consistency is very strong. Looking at other modules with the perspective of what is exported. Zlib exports functions with sync prefix with the function is sync and without it otherwise.

Maybe this can be a turning point?

I saw that @mcollina added the tsc-agenda tag to the PR #61262. Maybe this naming thing can be discusses as well?

louwers · 2026-01-17T19:26:03Z

I don't find the consistency argument very strong, because as far as I am aware there is no precedent of postfixing a class with *Sync (versus leaving it out for async). It's only functions.

I also don't think creating a separate Database class that allows asynchronous operations is a particularly good API. It makes it harder to switch between async and sync. Due to the overhead workers incur, I expect many applications will want to only use async for a subset of queries (and ideally be able to switch between one or the other without too much fuss).

In addition, we are staying quite close to the SQLite API, which I think it a good thing to do. The current DatabaseSync class has a single SQLite database connection handle, wheras the proposed (async) Database wraps complicated machinery to juggle multiple database connection handles on different threads. I think the naming should reflect this (call it a database pool or something similar) and offer a more constrained API that makes more sense for the underlying multi-threaded implementation.

I wonder what you think about this API and if it would be feasible.

geeksilva97 · 2026-01-17T21:12:44Z

I don't find the consistency argument very strong, because as far as I am aware there is no precedent of postfixing a class with *Sync (versus leaving it out for async). It's only functions.

That's why I mentioned the perspective. If you look at what is exported, they follow this rule.

I wonder what you think about #54307 (comment) and if it would be feasible.

I like your proposal. It brings flexibility, with the async stuff under the pool attribute. I don't think we need a threadingMode though. We can keep all async under pool anyways.

If we all agree we can drop the Sync postfix - as Bun - I think we can do this.

The machinery will still have complexity because that's how it is. Every work must be splitted to run part on thread pool (SQLite stuff) part on main thread (object creation like arrays and objects, and promise resolving).

louwers · 2026-01-17T22:11:36Z

I don't think we need a threadingMode though.

Agreed. I thought this would make it possible to use single threaded mode for individual database connections, but this is not possible as pointed out by @BurningEnlightenment.

Glad you like it though!

The machinery will still have complexity because that's how it is.

Yes, but this API makes it clear that it not 'just' a SQLite database handle.

addaleax · 2026-01-27T18:24:17Z

src/node_sqlite.cc

+  } else {
+    ThreadPoolWork* next_work = task_queue_.front();
+    task_queue_.pop();
+    next_work->ScheduleWork();


As mentioned before, setting the thread safety build flag to 1/serializable is pretty much guaranteed to be at least as efficient as building our own scheduling mechanism, so I'd remove this code and instead do that

geeksilva97 · 2026-01-27T22:44:15Z

Appreciate your help, @addaleax . I will revisit this implementation and will take your suggestion in consideration as well @louwers 🚀

mceachen · 2026-02-12T05:04:03Z

src/node_sqlite.cc

  ASSIGN_OR_RETURN_UNWRAP(&db, args.This());
  Environment* env = Environment::GetCurrent(args);
  THROW_AND_RETURN_ON_BAD_STATE(env, !db->IsOpen(), "database is not open");
+  db->is_closing_.store(true, std::memory_order_release);
  db->FinalizeStatements();


If I'm reading this correctly, FinalizeStatements() and DeleteSessions() run before the has_running_task_ check at line 1238. Both call into SQLite (sqlite3_finalize, sqlite3session_delete) accessing connection_ on the main thread, while the thread pool may be executing sqlite3_exec(connection_).

The sqlite3_close_v2 is correctly deferred, but these cleanup calls should be deferred too - either move them after the has_running_task_ check, or into MaybeCloseConnection().

mcollina

I would actually prefer if sync and async code paths would not cross. In other words, if I opened a db asynchronously, I would not be able to block the main thread on it. This distinction would prevent all sort of user problems.

louwers · 2026-02-16T10:45:50Z

@mcollina That's a valid concern.

Do you have an opinion on naming? Just Database for the database pool? I prefer something else for reasons outlined here...

mcollina · 2026-02-16T11:03:31Z

I'm not even convinced we need a pool. That is not what node-sqlite3 was doing, wasn't it?

I would keep Database for the async version, yes.

louwers · 2026-02-16T13:20:37Z

Just a single thread? Not sure about node-sqlite3 but that would simplify things a lot.

mceachen · 2026-02-16T17:21:00Z

Edit: if Node.js/libuv had lightweight threads, the resource management simplicity of one thread per Database instance and Database Connection Handle would be appealing, but aren't we stuck with heavyweight OS threads?

FWIW, node-sqlite3 uses libuv mutexes and sqlite3_db_mutex to manage concurrency -- it doesn't bind connections to a thread. It's quite complex: database instances share the same libuv thread pool, each Database instance has a bool locked and bool serialize field, and depending on the operation, some mélange of mutexes are acquired and released.

mcollina · 2026-02-16T21:15:34Z

Disclosure: this comment was drafted with AI assistance and reviewed/edited by me.

I dug into TryGhost/node-sqlite3 (current master) to validate how one Database instance is used across libuv workers, and the key point is that it uses two synchronization layers for different problems:

Addon-level queueing/locking (semantic ordering + lifecycle safety)
SQLite mutexing (low-level thread safety around the shared sqlite3* handle)

1) Addon-level scheduler: when work is queued vs dispatched

A Database object owns a single sqlite3* _handle. Every async operation carries Database* db in a baton and runs via napi_queue_async_work (libuv threadpool).

Scheduling is done in Database::Schedule() / Database::Process().

Roughly:

If DB is not open/usable -> error path
Else, queue if !open || ((locked || exclusive || serialize) && pending > 0)
Else, dispatch immediately

Meaning:

pending = number of in-flight async jobs touching this DB
exclusive jobs (e.g. close/exec/loadExtension/wait) require no overlap with other in-flight jobs
serialize mode forces newly scheduled work to behave like exclusive work
parallelize mode allows non-exclusive jobs to be issued without forced serialization

Clarification of terms (important):

exclusive = operation-level scheduling flag.
That specific operation cannot start until pending == 0 (no other DB work in flight), and once started it prevents overlapping starts until it completes.
serialize = connection-level mode flag.
When enabled, every newly scheduled operation is treated as exclusive (exclusive || serialize), so work is effectively one-at-a-time and ordered.
When disabled (parallelize), only explicitly exclusive operations get that strict treatment.

Process() drains queue only when state allows it. Important guard:

if next queued call is exclusive and pending > 0, it waits in queue

So queueing here is not just about thread-safety; it enforces operation semantics and lifecycle invariants.

2) SQLite mutexes: protecting the shared connection internals

Even with the addon queue, work still executes on threadpool workers and may touch the same connection handle. node-sqlite3 therefore uses SQLite in thread-safe mode:

compiled with SQLITE_THREADSAFE=1
default open flags include SQLITE_OPEN_FULLMUTEX

Additionally, statement worker code explicitly acquires the per-DB SQLite mutex (sqlite3_db_mutex(db->_handle)) around critical sections (sqlite3_prepare_v2, bind/step/error retrieval paths, etc.).

That protects internal SQLite connection/statement state and avoids races while extracting status/error data.

Why both layers are needed

FULLMUTEX alone gives memory/thread safety, but it does not give higher-level behavior guarantees expected by the JS API. The addon queue is still needed to guarantee things like:

close runs only when prior work is drained
exclusive operations do not interleave with others
serialize/parallelize API semantics
deterministic callback/error event behavior for queued operations

So:

SQLite mutexes = “is concurrent access to this connection memory-safe?”
Addon queue/locks = “is operation ordering/lifecycle behavior correct for this API?”

That combination is what allows one Database instance to be shared across multiple libuv workers safely while preserving node-sqlite3 semantics.

BurningEnlightenment · 2026-02-16T23:11:00Z

FWIW, node-sqlite3 uses libuv mutexes and sqlite3_db_mutex to manage concurrency -- it doesn't bind connections to a thread. It's quite complex: database instances share the same libuv thread pool, each Database instance has a bool locked and bool serialize field, and depending on the operation, some mélange of mutexes are acquired and released.

The node-sqlite3 "parallel" dispatch strategy has the distinct disadvantage of very bad cache locality and thrashing the db connection mutex as you have potentially multiple uv thread pool threads waiting on a single mutex. This also introduces the risk of (temporary) thread pool starvation and SQLITE_BUSY timeouts if one query hogs the mutex for a long time. Optimally one would only ever schedule one uv thread pool task at the same time for one db connection which drains a single-producer-single-consumer (spsc) thread safe queue of query tasks while asynchronously dispatching completion events/callbacks to the main thread. And this would be perfectly fine with SQLITE_THREADSAFE=2.

mceachen · 2026-02-16T23:31:48Z

Having helped maintain the better-sqlite3 package for several years, I can't remember the last issue we had that was from a "code correctness"/thread/lock/mutex related defect (it's all build/compilation issues because it's not using n-api!). The abundant issue list for node-sqlite3 is a wholly different story--it's really hard to get right, and I personally hit several bugs with my project, which is why I switched to better-sqlite3 in the first place.

Also: the performance hit from bad cache locality (like @BurningEnlightenment mentioned) and all the mutex juggling results in 4X to 10X performance hits for some functionality: https://github.com/photostructure/node-sqlite/tree/main/benchmark#performance-results

geeksilva97 · 2026-02-17T02:47:44Z

Appreciate all your reviews here. I changed configuration of threads back to 1 (and a few other things in my local branch).

The thing is as @mcollina has pointed out. It this thing even needed? The more I worked on it the less it seemed useful.

I know there are some use cases. But for those, people can use worker threads? Having this working on a threadpool would be having a database connection with less capabilities (it won't have an isolate, hence no custom functions, aggregations or authorizers).

mceachen · 2026-02-17T03:29:17Z

@geeksilva97 Agreed: I think brief example code on how to run long running queries on a separate node:worker_threads instance would be the most conservative and most expeditious way forward (but I'm just shouting from the peanut gallery here).

My main hope was to highlight the substantial complexity that the node-sqlite3 package tackled (and how Joshua's better-sqlite3 and the current node:sqlite sync approach avoids it!)

nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. sqlite Issues and PRs related to the SQLite subsystem. labels Jul 18, 2025

himself65 reviewed Aug 1, 2025

View reviewed changes

src/node_sqlite.h Outdated Show resolved Hide resolved

geeksilva97 force-pushed the sqlite-async branch from ac01d39 to ed659be Compare August 1, 2025 05:57

geeksilva97 force-pushed the sqlite-async branch from dbf9d40 to 4ffdccf Compare August 28, 2025 02:36

geeksilva97 added the wip Issues and PRs that are still a work in progress. label Aug 28, 2025

geeksilva97 mentioned this pull request Aug 28, 2025

src,sqlite: refactor value conversion #59659

Merged

geeksilva97 force-pushed the sqlite-async branch 5 times, most recently from 6561e49 to 9af39ec Compare October 1, 2025 03:39

geeksilva97 force-pushed the sqlite-async branch from 9af39ec to ed24a9b Compare October 28, 2025 02:14

geeksilva97 force-pushed the sqlite-async branch from ed24a9b to 710f3b5 Compare November 20, 2025 05:33

geeksilva97 force-pushed the sqlite-async branch 2 times, most recently from 4bf5609 to 8ce4e74 Compare November 28, 2025 04:07

src: add async api

081d7c5

geeksilva97 force-pushed the sqlite-async branch from 8ce4e74 to 081d7c5 Compare November 28, 2025 04:09

geeksilva97 marked this pull request as ready for review November 28, 2025 04:09

geeksilva97 marked this pull request as draft November 28, 2025 15:36

geeksilva97 force-pushed the sqlite-async branch from e46021e to 5e6d3a2 Compare November 28, 2025 16:04

src: cleanup sqlite async api

7acd4eb

geeksilva97 force-pushed the sqlite-async branch from 5e6d3a2 to 7acd4eb Compare November 28, 2025 16:53

geeksilva97 added 2 commits November 28, 2025 17:27

test,sqlite: test async classes

cc6263a

sqlite: add sqlite property to async database

58ca77f

geeksilva97 marked this pull request as ready for review November 28, 2025 21:16

louwers mentioned this pull request Jan 5, 2026

sqlite: mark as release candidate #61262

Open

louwers reviewed Jan 5, 2026

View reviewed changes

geeksilva97 added 3 commits January 15, 2026 00:07

src: avoid custom functions in async mode

cada892

src: handle race condition

28ba2bd

sqlite: test async ordering

45baba7

geeksilva97 force-pushed the sqlite-async branch from 1f44af2 to 45baba7 Compare January 15, 2026 04:00

test: fix lint issues

3c04b3e

addaleax reviewed Jan 27, 2026

View reviewed changes

mceachen reviewed Feb 12, 2026

View reviewed changes

mcollina reviewed Feb 16, 2026

View reviewed changes

geeksilva97 closed this Feb 18, 2026

Uh oh!

Conversation

geeksilva97 commented Jul 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Design

Public API

Uh oh!

nodejs-github-bot commented Jul 18, 2025

Uh oh!

geeksilva97 commented Jul 18, 2025

Concurrency Control

Uh oh!

geeksilva97 commented Jul 18, 2025

Uh oh!

Uh oh!

geeksilva97 commented Dec 1, 2025

Uh oh!

louwers commented Dec 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

louwers Jan 5, 2026

Choose a reason for hiding this comment

Uh oh!

geeksilva97 Jan 17, 2026

Choose a reason for hiding this comment

Uh oh!

louwers Jan 17, 2026

Choose a reason for hiding this comment

Uh oh!

mceachen Feb 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

BurningEnlightenment Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

mceachen Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

mcollina commented Jan 6, 2026

Uh oh!

geeksilva97 commented Jan 7, 2026

Uh oh!

geeksilva97 commented Jan 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

geeksilva97 commented Jan 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

louwers commented Jan 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

geeksilva97 commented Jan 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

louwers commented Jan 17, 2026

Uh oh!

addaleax Jan 27, 2026

Choose a reason for hiding this comment

Uh oh!

geeksilva97 commented Jan 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mceachen Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

mcollina left a comment

Choose a reason for hiding this comment

Uh oh!

louwers commented Feb 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mcollina commented Feb 16, 2026

Uh oh!

louwers commented Feb 16, 2026

Uh oh!

mceachen commented Feb 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mcollina commented Feb 16, 2026

geeksilva97 commented Jul 18, 2025 •

edited

Loading

louwers commented Dec 1, 2025 •

edited

Loading

mceachen Feb 12, 2026 •

edited

Loading

geeksilva97 commented Jan 15, 2026 •

edited

Loading

geeksilva97 commented Jan 17, 2026 •

edited

Loading

louwers commented Jan 17, 2026 •

edited

Loading

geeksilva97 commented Jan 17, 2026 •

edited

Loading

geeksilva97 commented Jan 27, 2026 •

edited

Loading

louwers commented Feb 16, 2026 •

edited

Loading

mceachen commented Feb 16, 2026 •

edited

Loading

BurningEnlightenment commented Feb 16, 2026 •

edited

Loading

geeksilva97 commented Feb 17, 2026 •

edited

Loading